Fix GEMM dispatch for complex-real matmul#1520
Conversation
|
Is this redirecting to the "reinterpret-trick"? Did that get lost over the more recent changes? |
|
just for posterity if it happens to be useful, I had just bisected this to JuliaLang/julia#55002 |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1520 +/- ##
==========================================
+ Coverage 93.94% 94.03% +0.09%
==========================================
Files 34 34
Lines 15984 15980 -4
==========================================
+ Hits 15016 15027 +11
+ Misses 968 953 -15 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@dkarrasch Exactly, this had regressed from the previous behavior. |
|
@adienes Thanks for the bisect! It appears that the fix required is a bit different. The method already existed, but the BLAS flag format had changed between releases and hadn't been updated. |
| α::Number, β::Number, ::Val{BlasFlag.GEMM}) where {T<:BlasReal} | ||
| gemm_wrapper!(C, tA, tB, A, B, α, β) | ||
| end | ||
| Base.@constprop :aggressive function generic_matmatmul_wrapper!(C::StridedVecOrMat{Complex{T}}, tA, tB, A::StridedVecOrMat{Complex{T}}, B::StridedVecOrMat{T}, |
There was a problem hiding this comment.
I don't think this method is ever taken. BlasReal matrices would always use GEMM as the flag, and gemm_wrapper! may delegate to _generic_matmatmul! if necessary.
This should fix #1519. The issue on master is that we have specialized dispatches for arrays of the same eltype, and the complex-real matmul ends up in `generic_matmatmul!`. This adds an extra method to ensure that the complex-real case also reaches BLAS. ```julia julia> A = ones(ComplexF64, 400, 400); B = ones(size(A)); C = similar(A); julia> @Btime mul!($C, $A, $B); 57.043 ms (0 allocations: 0 bytes) # master 1.608 ms (0 allocations: 0 bytes) # this PR ```
This should fix #1519. The issue on master is that we have specialized dispatches for arrays of the same eltype, and the complex-real matmul ends up in `generic_matmatmul!`. This adds an extra method to ensure that the complex-real case also reaches BLAS. ```julia julia> A = ones(ComplexF64, 400, 400); B = ones(size(A)); C = similar(A); julia> @Btime mul!($C, $A, $B); 57.043 ms (0 allocations: 0 bytes) # master 1.608 ms (0 allocations: 0 bytes) # this PR ```
This should fix #1519. The issue on master is that we have specialized dispatches for arrays of the same eltype, and the complex-real matmul ends up in
generic_matmatmul!. This adds an extra method to ensure that the complex-real case also reaches BLAS.